Monitoring forest cover changes is an important task for forest resource
management and planning. In this context, remote sensing images have
shown high potential for detecting forest cover changes. In Vietnam,
despite the availability of a large number of such images and ground-truth
labels, current research still relies on classical methods that employ manually
designed indices, such as multi-variant change vector analysis (MVCA) and
the normalized difference vegetation index. These methods require substantial
domain knowledge to determine forest-change threshold values that are
applicable only to the studied areas. Therefore, in this paper, we propose a
method to detect coastal forest cover changes that can exploit the available
datasets and ground-truth labels. Moreover, the proposed method does not
require much domain knowledge. We used multi-temporal Sentinel-2
imagery to train a segmentation model based on the U-Net network.
The model was then used to detect forest areas in images of the same location
taken at different times. Lastly, we compared the obtained results to identify
forest disturbances. Experimental results demonstrated that our method achieved
a high accuracy of 95.4% on the testing set. Furthermore, we compared our model
with the MVCA method and found that our model outperforms this popular method
by 3.8%.

1. INTRODUCTION

Coastal forests are an important part of tropical biodiversity. They provide many important
ecosystem services, such as protection against extreme weather, erosion prevention, habitats for
different species, and storage of blue carbon, which helps to mitigate climate change [1]. However, these
forests are increasingly vulnerable to degradation as a result of climate change, sea-level rise, or
anthropogenic processes such as deforestation [2]. To address these issues, accurate and automated forest
cover monitoring is crucial [3]. In this context, high-resolution remote sensing images collected from
satellites, such as the European Sentinel-2A and -2B or Landsat-8, offer a promising and cost-efficient source for
an automatic solution [4].

Most previous studies focused on traditional methods using hand-crafted features, such as
multi-variant change vector analysis (MVCA), the normalized difference vegetation index (NDVI), and so on
[1], [2], [5]-[9]. These methods have drawbacks that prevent their wide application, especially by users who are
not experts in forestry and remote sensing technology. On the one hand, they require considerable effort and time
due to their heavy dependence on hand-crafted features. On the other hand, they are ad hoc solutions that are suitable
only for specific regions. Therefore, these methods are time-consuming and difficult to transfer to new areas.

Recently, with the development of deep learning technology, object detection in remote
sensing images has made significant progress. Deep neural networks extract features automatically,
avoid manual feature selection, and reduce the manual steps in monitoring forest cover change [10]-[12]. Convolutional
neural networks (CNNs) are among the best-known deep learning algorithms and have been widely used in
remote sensing image classification. Because they extract more meaningful features, the classification of
these images usually achieves higher performance [13].

For example, de Bem et al. [10] presented a method that used CNNs and Landsat data for
deforestation detection in the Brazilian Amazon. The authors applied three CNN architectures,
U-Net, ResUnet, and SharpMask, to classify the changes between 2017-2018 and 2018-2019. The
experimental results show that the networks achieved high accuracy without any post-processing for noise
removal. Stoian et al. [11] also applied CNNs to build land cover maps from high-resolution
satellite image time series. Based on Sentinel-2 L2A data, the U-Net network was applied in
that study to deal with sparse annotation data while maintaining high-resolution output. Such networks are
even applicable to incomplete satellite imagery in similar problems. For instance, Khan et al. [14] detected
forest cover changes over 29 years (1987-2015) while facing incomplete and noisy
data. Using a deep CNN, they mapped the raw data to more separable features, which
were then employed to detect the changes. Many similar applications can be found in the literature, such as the
works in [12], [15]-[20].

We are interested in monitoring forest cover change using deep learning. Numerous works that
applied deep neural networks, such as CNNs and U-Net, to satellite images to detect forest loss areas have been
proposed worldwide [10], [11], [14], [21], [22]. However, in Vietnam, traditional machine learning is still
widely used. In this paper, we propose a method for coastal forest cover change detection in Vietnam. Based
on remote sensing images from the European Sentinel-2A and -2B satellites, we trained a U-Net model to detect forest and
non-forest areas. We then combined the results with geographic information system (GIS) information to compute the
forest cover changes and evaluated the results against available information from the national forest monitoring
system. The proposed method can be applied to different areas with less effort from domain experts.

The paper is organized as follows: section 2 introduces our research method. Section 3 presents the
experimental results and discussion. Lastly, section 4 concludes the work and proposes some
future work.


2. RESEARCH METHOD 
2.1. Method overview 

The main objective of this study is to automatically detect and calculate coastal forest cover changes
in Hai Phong city, northern Vietnam. We performed pixel-level semantic segmentation on Sentinel-2A and
-2B images to classify forest and non-forest areas. These images cover the same areas at
two different times. Therefore, by combining the results with GIS information, we can detect and calculate forest cover
changes. To do so, we carried out three main steps, as presented in Figure 1:


[Figure 1 diagram: Model training -> Forest cover detection -> Cover change -> Forest loss]


Figure 1. The proposed method 


- Model training: in this first step, we trained a semantic segmentation model based on the U-Net
neural network. The training dataset came from Sentinel-2 imagery. We also evaluated the trained model
using the forest cover layers extracted from the national forest resource monitoring system (FRMS) of Vietnam.

- Forest cover detection: after training, we applied the model to classify forest and non-forest areas in
images of the same location taken at different times.

- Cover change analysis: lastly, based on the forest cover at the two different times, we detected and
calculated the changes.

The model training step consists of data preparation, model training, and testing, while the last two steps
consist of applying the model and GIS analysis. At each of these steps, we applied related techniques from deep
learning, satellite image processing, GIS, and so on. The following sections detail these steps.


2.2. Model training process

We adopted a conventional deep learning procedure, as shown in Figure 2, to train our model. Since
remote sensing images are more complex and blurrier than natural images, several data
preparation steps are needed to clean and normalize the input data. Furthermore, to obtain an objective result, we
relied on real data extracted from FRMS to evaluate the trained model. This system supports state management in
monitoring forest cover changes. The data is manually and regularly updated by Vietnamese local forest
rangers through a Quantum GIS (QGIS) plug-in developed by the Development of
a Management Information System for the Forestry Sector in Viet Nam (FORMIS) phase II project [23]. In
short, after the data preparation step, we obtained two types of data: i) forest satellite images that were
obtained after a series of data collection and pre-processing steps (detailed in the next section) and ii)
forest cover layers that were extracted from FRMS and manually verified.


[Figure 2 diagram: forest satellite images and forest cover layers -> Preparation -> Model training -> Model testing]


Figure 2. Model training process 


The model is based on the U-Net network architecture [24]. We used the satellite images for model
training, while the information extracted from FRMS (forest cover layers), combined with GIS, was
employed for model evaluation. The following section details the data preparation step.


2.3. Data preparation 

We collected satellite images of Hai Phong city from the Sentinel-2 MSI (MultiSpectral Instrument)
Level-2A [25] dataset, which is available from March 28, 2017. Hai Phong is a port city located in northern
Vietnam, between 20°30'N and 21°01'N and between 106°23'E and 107°08'E. It borders Quang Ninh
province to the north, Hai Duong province to the west, Thai Binh province to the south, and the East Sea to the east. The
city has a long coastal mangrove forest, with a total area of 26,127.58 hectares.

Since techniques to capture remote sensing and natural optical images are different, there are several 
challenges while working with satellite images. Therefore, several pre-processing steps should be performed 
before model training, as illustrated in Figure 3. First, we selected suitable scene images from Sentinel-2. For 
this purpose, remote sensing image processing was performed (the upper process in Figure 3): 


Figure 3. Data preparation and pre-processing

- Scene detection: we selected only image scenes covering the coastal and mangrove forests of Hai Phong city.
Then, we kept only images captured in 2018 and 2019. Lastly, images with a cloud rate
greater than 30% were eliminated. After this step, we obtained 26 and 32 images captured in 2018 and
2019, respectively.

- Band selection: Sentinel-2 has 13 spectral bands with different bandwidths and spatial resolutions. In
this study, we directly used ten bands as input features: bands 2 to 8, 8A, 11, and 12,
with wavelengths of 0.490 µm, 0.560 µm, 0.665 µm, 0.705 µm, 0.740 µm, 0.783 µm, 0.842 µm,
0.865 µm, 1.610 µm, and 2.190 µm. Bands 1, 9, and 10 were ignored because they are not relevant
to vegetation [26]. Moreover, we also computed three indices that are widely applied in similar problems:
the normalized difference vegetation index (NDVI), the normalized difference snow index (NDSI), and the
normalized difference water index (NDWI). They are computed as in (1).


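$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}, \quad \mathrm{NDSI} = \frac{\mathrm{Green} - \mathrm{SWIR}}{\mathrm{Green} + \mathrm{SWIR}}, \quad \mathrm{NDWI} = \frac{\mathrm{NIR} - \mathrm{SWIR}}{\mathrm{NIR} + \mathrm{SWIR}} \tag{1}$$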


where near infrared reflectance (NIR) is band 8, Red is band 4, Green is band 3, and short-wave infrared 
(SWIR) is band 11. 

- Cloud removal: we removed clouds using the QA60 band, a bitmask band containing cloud mask
information [27]. Since bits 10 and 11 flag opaque clouds and cirrus, respectively, we could filter out all cloudy pixels.
Figure 4(a) and Figure 4(b) show an example of a selected image before and after cloud removal.

- Median value calculation: to improve image quality, we applied a median filter that moves
through the image pixel by pixel and replaces each value with the median of the neighboring values.

- Image cropping: at this step, we cropped the images to focus only on the studied areas.

Figure 4. Satellite images before (a) and after (b) cloud removal
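
As a rough illustration of the band-selection and cloud-removal steps listed above, the sketch below computes the three indices of (1) for a single pixel and tests the two QA60 cloud bits. It is written in plain Python for readability; the function names, the sample reflectance values, and the convention of returning 0 when a denominator is zero are assumptions made for this example and are not taken from the original processing chain.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); 0.0 when the denominator is zero (assumed convention)."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(nir, red, green, swir):
    """Indices of equation (1): NIR = band 8, Red = band 4, Green = band 3, SWIR = band 11."""
    return {
        "NDVI": normalized_difference(nir, red),
        "NDSI": normalized_difference(green, swir),
        "NDWI": normalized_difference(nir, swir),
    }


CLOUD_BIT = 1 << 10   # QA60 bit 10: opaque clouds
CIRRUS_BIT = 1 << 11  # QA60 bit 11: cirrus clouds


def is_cloudy(qa60_value):
    """True when either cloud bit is set in a QA60 bitmask value."""
    return (qa60_value & (CLOUD_BIT | CIRRUS_BIT)) != 0


# Hypothetical reflectance values for one vegetated pixel.
indices = spectral_indices(nir=0.40, red=0.05, green=0.08, swir=0.15)

# Keep only QA60 values with neither cloud bit set.
clear_values = [v for v in (0, 1024, 2048, 3072) if not is_cloudy(v)]
```

In a real workflow the same arithmetic would be applied to whole image arrays rather than to single values.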


These scene images were then combined with forest cover layers extracted from FRMS to build a
pixel-level labeled dataset for model training, as shown in the lower process in Figure 3. We extracted four
important pieces of information from FRMS: administrative information, coordinates, forest
observations (0 for non-forest, 1 for forest), and detailed plot information. Since the forest cover layers were
manually entered into FRMS by local rangers, we conducted several field trips to verify the ground-truth
labels. Based on the available resources, we selected a number of sample points and manually checked whether the
recorded information (forest or non-forest) was correct. After verification, we obtained 1,500 sample points with correct
labels. Centered on each of these points, two corresponding neighborhood patches were created: i) an image
patch of size 256x256x13 from the cropped satellite images (ten bands plus three indices) and ii) a forest cover
layer patch of size 256x256, as presented in Figure 3. Finally, stacking the two gave a dataset of samples of
size 256x256x14 for model training.
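
The patch construction described above can be sketched as follows. This is a minimal illustration on nested Python lists with a toy 8x8 image and 4x4 patches instead of 256x256; the helper names `extract_patch` and `stack_label`, and the choice to reject points too close to the border, are assumptions for the example rather than the procedure actually used.

```python
def extract_patch(grid, center_row, center_col, size):
    """Return a size x size window of `grid` around (center_row, center_col).

    For an even `size` the window starts size // 2 cells before the centre.
    """
    half = size // 2
    top, left = center_row - half, center_col - half
    if top < 0 or left < 0 or top + size > len(grid) or left + size > len(grid[0]):
        raise ValueError("patch would extend beyond the image border")
    return [row[left:left + size] for row in grid[top:top + size]]


def stack_label(image_patch, label_patch):
    """Append the 0/1 label as the last channel of every pixel's feature list."""
    return [
        [list(pixel) + [label] for pixel, label in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image_patch, label_patch)
    ]


# Toy data: an 8 x 8 image with 13 features per pixel and a matching label grid.
image = [[[float(r * 8 + c)] * 13 for c in range(8)] for r in range(8)]
labels = [[1 if c >= 4 else 0 for c in range(8)] for r in range(8)]

# One training sample centred on the (hypothetical) verified point at row 4, column 4.
sample = stack_label(extract_patch(image, 4, 4, 4), extract_patch(labels, 4, 4, 4))
```

Each pixel of `sample` then carries 13 input features followed by its label, mirroring the 13 + 1 = 14 channels mentioned in the text.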


2.4. U-net neural networks 

In this study, we applied U-Net, a convolutional network for multi-class image
segmentation [24]. It supports per-pixel classification, which allows us to predict the class of each pixel. We
adapted the architecture proposed in [28], using fewer filters because our training set is limited; this also
helps to prevent over-fitting, as shown in Figure 5. Since the input size is 256x256x14, we adapted the
network architecture accordingly. Sigmoid activation functions were used to ensure that the output pixel values
range between 0 and 1.
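
To make the role of the sigmoid output concrete, the following sketch maps raw per-pixel network outputs to values between 0 and 1 and then to a binary forest mask. The 0.5 decision threshold and the function names are assumptions for illustration; the text does not state which threshold was applied.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def to_forest_mask(raw_outputs, threshold=0.5):
    """Convert a 2-D grid of raw outputs into 0/1 labels (1 = forest)."""
    return [[1 if sigmoid(v) >= threshold else 0 for v in row] for row in raw_outputs]


# Hypothetical raw outputs for a 2 x 2 tile.
mask = to_forest_mask([[-2.0, 0.0], [3.0, -0.1]])
```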


Coastal forest cover change detection using satellite images and ... (Khanh Nguyen-Trong) 


934 ISSN: 2252-8938


2.5. Training and validation setup 

We split the collected data into three datasets: a training set of 1,000 image patches, a
validation set of 300 patches, and a testing set of 200 patches. The model was trained
using binary cross-entropy as the loss function, the Adam optimizer (ε = 10^-7, β1 = 0.9, and β2 = 0.999), a
mini-batch size of 100, and an early-stopping criterion on the validation set. Before creating the batches, we also
shuffled the data, which helps the model learn better and yields more objective results. To evaluate the
experiments, the F1 score, precision, recall, and accuracy were used. The model was implemented with TensorFlow 2.2.0,
Keras 2.3.1, and Python 3.6, on a Tesla K80 GPU and an Intel Xeon(R) CPU.
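
The shuffled 1,000/300/200 split and the binary cross-entropy loss mentioned above can be illustrated with the standard library alone. This is only a sketch: the random seed, the clipping constant used inside the loss, and the function names are assumptions, and the actual training relied on the Keras implementations.

```python
import math
import random


def split_dataset(samples, n_train=1000, n_val=300, n_test=200, seed=0):
    """Shuffle a copy of `samples` and cut it into train/validation/test lists."""
    samples = list(samples)
    if len(samples) != n_train + n_val + n_test:
        raise ValueError("split sizes must add up to the number of samples")
    random.Random(seed).shuffle(samples)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


def binary_cross_entropy(y_true, y_pred, clip=1e-7):
    """Mean binary cross-entropy over flat lists of labels and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, clip), 1.0 - clip)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


# Patch identifiers 0..1499 stand in for the 1,500 verified samples.
train_set, val_set, test_set = split_dataset(range(1500))

# Loss for three hypothetical pixels.
loss = binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6])
```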

Figure 5. U-Net architecture [28]


2.6. Forest cover change analysis 

After training and validation, we determined forest cover changes, as illustrated in Figure 6. The
trained model detected forest and non-forest areas in images of the same location captured at different times.
The results were then compared to identify the cover changes. By combining them with GIS information
extracted from FRMS, we can calculate the changes in area.

Figure 6. Forest cover change analysis
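
The comparison of two predicted masks can be sketched as a per-pixel difference followed by an area count. The example assumes that each pixel covers 10 m x 10 m (0.01 ha), which holds only for imagery resampled to the 10 m Sentinel-2 grid; the function names and the tiny 3x3 masks are likewise hypothetical.

```python
PIXEL_AREA_HA = 0.01  # assumed: one 10 m x 10 m pixel = 100 m^2 = 0.01 ha


def change_map(mask_before, mask_after):
    """Per-pixel change: -1 = forest loss, 0 = no change, +1 = forest gain."""
    return [
        [after - before for before, after in zip(row_before, row_after)]
        for row_before, row_after in zip(mask_before, mask_after)
    ]


def changed_area_ha(changes, kind):
    """Total area in hectares of the pixels whose change value equals `kind`."""
    return sum(row.count(kind) for row in changes) * PIXEL_AREA_HA


# Hypothetical forest masks (1 = forest) for the two dates.
forest_2018 = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
forest_2019 = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]

changes = change_map(forest_2018, forest_2019)
loss_ha = changed_area_ha(changes, -1)
gain_ha = changed_area_ha(changes, 1)
```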


2.7. MVCA method

For performance evaluation, we compared the proposed method with MVCA, which is widely used in
Vietnam [7], [29]. This method uses the NDVI and NDSI at the beginning ($\mathrm{NDVI}_b$, $\mathrm{NDSI}_b$) and
end ($\mathrm{NDVI}_e$, $\mathrm{NDSI}_e$) of the period to calculate two change vectors, as shown in (2) and (3).
Then, with the help of expert knowledge, the method uses two thresholds to determine forest loss. In this study,
there is forest loss if ChangeIndex1 > 48 and ChangeIndex2 > 16.8.


$$\mathrm{ChangeIndex1} = \sqrt{(\mathrm{NDVI}_b - \mathrm{NDVI}_e)^2 + (\mathrm{NDSI}_b - \mathrm{NDSI}_e)^2} \tag{2}$$


$$\mathrm{ChangeIndex2} = (\mathrm{NDVI}_b - \mathrm{NDVI}_e) + (\mathrm{NDSI}_b - \mathrm{NDSI}_e) \tag{3}$$
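
A minimal sketch of the MVCA decision rule is given below. It assumes that the differences in (2) and (3) are taken as beginning minus ending values and that the indices have already been rescaled to the range for which the thresholds 48 and 16.8 were tuned; the text does not specify this scaling, so the sample values are purely hypothetical.

```python
import math

THRESHOLD_1 = 48.0  # ChangeIndex1 threshold quoted in the text
THRESHOLD_2 = 16.8  # ChangeIndex2 threshold quoted in the text


def change_indices(ndvi_begin, ndsi_begin, ndvi_end, ndsi_end):
    """Return (ChangeIndex1, ChangeIndex2), assuming beginning-minus-ending differences."""
    d_ndvi = ndvi_begin - ndvi_end
    d_ndsi = ndsi_begin - ndsi_end
    return math.hypot(d_ndvi, d_ndsi), d_ndvi + d_ndsi


def is_forest_loss(ndvi_begin, ndsi_begin, ndvi_end, ndsi_end):
    """Forest loss is flagged only when both indices exceed their thresholds."""
    index_1, index_2 = change_indices(ndvi_begin, ndsi_begin, ndvi_end, ndsi_end)
    return index_1 > THRESHOLD_1 and index_2 > THRESHOLD_2


# Hypothetical rescaled index values for two pixels.
cleared_pixel = is_forest_loss(80.0, 30.0, 40.0, 0.0)   # large drop in both indices
stable_pixel = is_forest_loss(80.0, 30.0, 75.0, 28.0)   # almost unchanged
```

Both thresholds must be re-estimated by experts for each study area, which is the dependence the proposed method aims to remove.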


3. RESULTS AND DISCUSSION 

With early stopping, training stopped at the 14th epoch. Figure 7(a) and Figure 7(b) show the
model training progress over time in terms of accuracy and loss. The training and validation accuracy
increase, while the training and validation loss decrease, as the number of training iterations increases. The gap
between the curves is also small, which suggests that no overfitting occurred. The model achieved a high
accuracy of 97.7% on the validation set and 95.4% on the testing set. This high performance can be explained
by the fact that the spectral and textural features of forest cover in RGB images are distinguishable by the
human eye, as presented in Figure 8. Due to the imbalance of labeled pixels, the precision, recall, and F1
score are 87.5%, 89.3%, and 87.2%, respectively, which are lower than the accuracy.

Figure 7. Progress of accuracy (a) and loss (b) on the training and validation set
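
The effect of class imbalance on these metrics can be reproduced with a toy example. In the sketch below, only 10 of 100 pixels belong to the positive class; the numbers are invented to illustrate why accuracy can exceed precision, recall, and F1, and they are not the experimental data.

```python
def scores(y_true, y_pred):
    """Precision, recall, F1, and accuracy for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy imbalanced case: 10 positive pixels among 100.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88

result = scores(y_true, y_pred)
```

Here four misclassified pixels lower precision and recall to 0.80 each, while accuracy stays at 0.96 because the many correctly classified negative pixels dominate the count.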


To detect forest cover changes, we applied the trained model to images of the same location taken in
2018 and 2019. The results were then used to detect and calculate forest cover changes, as shown in
Figure 8. The model accurately detected forest areas at the beginning and end of the period (2018 in Figure 8(a)
and 2019 in Figure 8(b)). Then, we overlaid the two results and performed several GIS operations to obtain the
forest cover changes, as detailed in Figure 8(c). According to Vietnamese government policy, an
increase or decrease in forest cover greater than 0.3 ha is considered a change.
Applying this rule, we detected five forest loss areas, as presented in Figure 8(d) (red parts). The
results were similar to those reported by local rangers in 2019, which indicates that our model is capable of
accurately detecting forest cover changes.
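
The 0.3 ha rule can be illustrated by grouping loss pixels into connected regions and keeping only the regions above the area threshold. The sketch below is an assumption-laden stand-in for the GIS operations actually used: it assumes 0.01 ha pixels, 4-connectivity, and a hypothetical helper name `loss_regions`.

```python
from collections import deque

PIXEL_AREA_HA = 0.01  # assumed: one 10 m x 10 m pixel = 0.01 ha


def loss_regions(loss_mask, min_area_ha=0.3):
    """Return the areas (ha) of 4-connected loss regions larger than `min_area_ha`."""
    rows, cols = len(loss_mask), len(loss_mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if loss_mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            n_pixels = 0
            while queue:
                y, x = queue.popleft()
                n_pixels += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and loss_mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = n_pixels * PIXEL_AREA_HA
            if area > min_area_ha:
                areas.append(area)
    return areas


# Toy loss mask: one 6 x 6 block (36 pixels) and one isolated pixel.
grid = [[1 if r < 6 and c < 6 else 0 for c in range(8)] for r in range(8)]
grid[7][7] = 1

large_losses = loss_regions(grid)
```

Only the 36-pixel block (about 0.36 ha) is reported; the isolated pixel falls below the threshold and is discarded.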

Compared with existing methods that are widely applied in Vietnam, our proposed method is more
robust and more accurate in forest cover detection. Experimental results show that our method outperforms
MVCA, which reached 91.6% on the testing set, by 3.8 percentage points. It also detects a higher level of forest
disturbances, as shown in Figure 9. Figure 9(a) and Figure 9(b) show the forest cover changes (the white and
yellow parts) predicted by MVCA and our proposed method, respectively. Our method produced results that are closer to
the real data reported by local rangers.

Moreover, the proposed method requires less expert knowledge than methods based on NDVI and
NDSI, as in [1], [2], [7], [9], [29]. Those methods rely heavily on domain experts to determine
threshold values that are applicable only to a specific area. In contrast, our method does not require such
thresholds: the model automatically learns useful features from the input data to detect forest cover.
Furthermore, as a deep learning model, it can be incrementally trained on new target areas. That is,
the model can be gradually provided with new samples to update its weights and thus improve its
classification over time.




Figure 8. Comparing forest cover (dark green part) in (a) 2018 and (b) 2019 to compute (c) all forest changes 
(yellow part) and (d) the ones greater than 0.3 ha (red part) 

Figure 9. Forest and forest plot cover change prediction by (a) MVCA (green: forest covers; white: forest
cover change) and (b) U-Net (green: forest covers; yellow: forest cover change)

Despite its advantages, the proposed method is not as easy to implement as MVCA and similar
threshold-based methods. It requires, on the one hand, a relatively large number of samples and, on the
other hand, ground-truth masks, which can be challenging and time-consuming to produce. In contrast, MVCA and similar
methods work with simpler sampling schemes and can produce reasonably acceptable results. However, in
Vietnam, thanks to FRMS, ground-truth labels are already available and are regularly entered by local rangers.
Therefore, the proposed method can be widely applied to automatically monitor forest cover.


4. CONCLUSION 

In this study, a deep learning-based method for coastal forest cover change detection has been
proposed. We used multi-temporal Sentinel-2 imagery to train a segmentation model based on the U-Net neural
network. Furthermore, we evaluated the model with forest cover information extracted from the national forest
resource monitoring system of Vietnam. The results showed that our method achieved good performance on
remote sensing images. The trained model achieved a high accuracy of 95.4% on the testing set and
outperformed the popular threshold-based methods used in Vietnam. Future work will focus on tree species
classification by improving the network architecture, enlarging our dataset, and proposing augmentation
methods for forest cover images.